Abstract

A guessing wiretapper's performance on a Shannon cipher system is analyzed for a source with memory. Close relationships 
between guessing functions and length functions are first established. Subsequently, asymptotically optimal encryption and attack 
strategies are identified and their performances analyzed for sources with memory. The performance metrics are exponents 
of guessing moments and probability of large deviations. The metrics are then characterized for unifilar sources. Universal 
asymptotically optimal encryption and attack strategies are also identified for unifilar sources. Guessing in the increasing order 
of Lempel-Ziv coding lengths is proposed for finite-state sources, and shown to be asymptotically optimal. Finally, competitive 
optimality properties of guessing in the increasing order of description lengths and Lempel-Ziv coding lengths are demonstrated. 
I. INTRODUCTION

We consider the classical Shannon cipher system [1]. Let $X^n = (X_1, \ldots, X_n)$ be a message where each letter takes values on a finite set $\mathbb{X}$. This message should be communicated securely from a transmitter to a receiver, both of which have access to a common secure key $U^k$ of $k$ purely random bits independent of $X^n$. The transmitter computes the cryptogram $Y = f_n(X^n, U^k)$ and sends it to the receiver over a public channel. The cryptogram may be of variable length. The function $f_n$ is invertible given $U^k$. The receiver, knowing $Y$ and $U^k$, computes $X^n = f_n^{-1}(Y, U^k)$. The functions $f_n$ and $f_n^{-1}$ are published. An attacker (wiretapper) has access to the cryptogram $Y$, knows $f_n$ and $f_n^{-1}$, and attempts to identify $X^n$ without knowledge of $U^k$. The attacker can use knowledge of the statistics of $X^n$. We assume that the attacker has a test mechanism that tells him whether a guess of $X^n$ is correct or not. For example, the attacker may wish to attack an encrypted password or personal information to gain access to, say, a computer account, or a bank account via internet, or a classified database [2]. In these situations, successful entry into the system or a failure provides the natural test mechanism. We assume that the attacker is allowed an unlimited number of guesses. Given the probability mass function (PMF) of $X^n$, the function $f_n$, and the cryptogram $Y$, the attacker can determine the posterior probabilities of the message $P_{X^n|Y}(\cdot \mid y)$. His best guessing strategy having observed $Y = y$ is then to guess in the decreasing order of these posterior probabilities $P_{X^n|Y}(\cdot \mid y)$. The key rate for the system is $k/n = R$, which represents the number of bits of key used to communicate one message letter.

Merhav and Arikan [2] study discrete memoryless sources (DMS) in the above setting and characterize the best attainable moments of the number of guesses that the attacker has to submit before success. In particular, they show that for a DMS with the governing single letter PMF $P$ on $\mathbb{X}$, the value of the optimal guessing exponent is given by
\[ E(R, \rho) = \max_{Q} \left[ \rho \min\{H(Q), R\} - D(Q \,\|\, P) \right], \]



where the maximization is over all PMFs $Q$ on $\mathbb{X}$, $H(Q)$ is the Shannon entropy of the PMF $Q$, and $D(Q \,\|\, P)$ is the Kullback-Leibler divergence between $Q$ and $P$. They also show that $E(R, \rho)$ equals $\rho R$ for $R < H(P)$, and equals the constant $\rho H_{1/(1+\rho)}(P)$ for $R > H(P_\rho)$. When $R < H(P)$, the key rate is not sufficiently large, and an exhaustive key-search attack is asymptotically optimal. When $R > H(P_\rho)$, the randomness introduced by the key is near perfect, and the cryptogram is useless to the attacker. The attacker submits guesses based directly on the message statistics, and $\rho H_{1/(1+\rho)}(P)$ is known to be the optimal guessing exponent in this scenario [3], where $H_{1/(1+\rho)}(P)$ is the Rényi entropy of the DMS $P$. For $H(P) < R < H(P_\rho)$, the optimal strategy makes use of both the key and the message statistics. $P_\rho$ is the PMF of an auxiliary DMS given by (47). Merhav and Arikan [2] also determine the best achievable performance based on the large deviations of the number of guesses for success, and show that it equals the Fenchel-Legendre transform of $E(R, \rho)$ as a function of $\rho$.
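The two extreme regimes of the exponent can be checked numerically. The following is a minimal sketch, assuming a binary DMS with the illustrative PMF $P = (0.8, 0.2)$ and $\rho = 1$ (neither is taken from the paper); the function names and the grid search over $Q$ are also illustrative choices, not the authors' method.

```python
import math

def entropy(q):
    """Shannon entropy in bits."""
    return -sum(t * math.log2(t) for t in q if t > 0)

def divergence(q, p):
    """Kullback-Leibler divergence D(q || p) in bits; assumes p has full support."""
    return sum(a * math.log2(a / b) for a, b in zip(q, p) if a > 0)

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha != 1) in bits."""
    return math.log2(sum(t ** alpha for t in p)) / (1.0 - alpha)

def guessing_exponent(p, R, rho, grid=20000):
    """Approximate E(R, rho) for a binary DMS by a grid search over Q = (t, 1 - t)."""
    best = -math.inf
    for k in range(grid + 1):
        q = (k / grid, 1.0 - k / grid)
        best = max(best, rho * min(entropy(q), R) - divergence(q, p))
    return best

P = (0.8, 0.2)   # illustrative source; H(P) is about 0.722 bits
rho = 1.0
low_rate = guessing_exponent(P, R=0.3, rho=rho)    # key rate below H(P)
high_rate = guessing_exponent(P, R=1.0, rho=rho)   # key rate equal to log|X|
```

In this sketch `low_rate` should agree with $\rho R$ and `high_rate` with $\rho H_{1/(1+\rho)}(P)$, up to the resolution of the grid.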

Secret messages typically come from natural languages, which can be well-modeled as sources with memory, e.g., a Markov source of an appropriate order. In this paper, we extend the results of Merhav and Arikan [2] to sources with memory. As a first step, we consider the perfect secrecy scenario (e.g., those analogous to $R > H(P_\rho)$ in the DMS case), and identify a tight relationship between the number of guesses for success and a lossless source coding length function. Specifically, we sandwich the number of guesses on either side by a suitable length function. Arikan's result [3] that the best value of the guessing exponent for memoryless sources is the Rényi entropy of an appropriate order immediately follows by recognizing that it is the least value of an average exponential coding length problem proposed and solved by Campbell [4]. Our approach based on length functions has the benefit of showing that guessing in the increasing order of lengths of compressed strings can yield a good attack strategy for sources with memory. In particular, guessing in the increasing order of Lempel-Ziv code lengths [5] for finite-state sources and increasing description lengths for unifilar sources [6] are asymptotically optimal in a sense made precise in the sequel.

Next, we establish similar connections between guessing and source compression for the key-constrained scenarios (i.e., those analogous to $R < H(P_\rho)$ in the memoryless case). We then study guessing exponents for the cipher system on sources with memory, and then specialize our results to show that all conclusions of Merhav and Arikan in [2] for memoryless sources extend to unifilar sources. We also consider the large deviations performance of the number of guesses and show that attacks based on the Lempel-Ziv coding lengths and minimum description lengths are asymptotically optimal for finite-state and unifilar sources, respectively. We then establish competitive optimality results for guessing based on these two length functions.

The paper is organized as follows. In Section II we study guessing under perfect secrecy and establish the relationship between guessing and source compression. In Section III we study the key-rate constrained system, establish optimal strategies for both parties for sources with memory, and study the relationship between guessing and a new source coding problem. In Section IV we characterize the performance for unifilar sources. In Section V we study the large deviations performance and establish the optimality properties of guessing based on Lempel-Ziv and minimum description lengths. Section VI summarizes the paper and presents some open problems.



II. Guessing under perfect secrecy and source compression 

Let us first consider the following ideal setting where $k = nR \ge n \log |\mathbb{X}|$. Enumerate all the sequences in $\mathbb{X}^n$ from $0$ to $|\mathbb{X}|^n - 1$ and let the function $f_n$ be the bit-wise XOR of the key bits and the bits representing the index of the message. The cryptogram is the message whose index is the output of $f_n$. The decryption function is also clear: simply XOR the bits
representing the cryptogram with the key bits. Such an encryption renders the cryptogram completely useless to an attacker 
who does not have knowledge of the key. The attacker's optimal strategy is to guess the message in the decreasing order of 
message probabilities. In case the attacker does not have access to the message probabilities, a robust strategy is needed. We 
first relate the problem of guessing to one of source compression. As we will see soon, robust source compression strategies 
lead to robust guessing strategies. 

For ease of exposition, and because we have perfect encryption, let us assume that the message space is simply X. The 
extension to strings of length n is straightforward. 

A guessing function 

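\[ G : \mathbb{X} \to \{1, 2, \ldots, |\mathbb{X}|\} \]

is a bijection that denotes the order in which the elements of $\mathbb{X}$ are guessed. If $G(x) = i$, then the $i$th guess is $x$. A length function

\[ L : \mathbb{X} \to \mathbb{N} \]

is one that satisfies Kraft's inequality

\[ \sum_{x \in \mathbb{X}} 2^{-L(x)} \le 1. \tag{1} \]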

To each guessing function $G$, we associate a PMF $Q_G$ on $\mathbb{X}$ and a length function $L_G$ as follows.
Definition 1: Given a guessing function $G$, we say $Q_G$ defined by

\[ Q_G(x) = c^{-1} \cdot G(x)^{-1}, \quad \forall x \in \mathbb{X}, \tag{2} \]

is the PMF on $\mathbb{X}$ associated with $G$. The quantity $c$ in (2) is the normalization constant. We say $L_G$ defined by

\[ L_G(x) = \lceil -\log Q_G(x) \rceil, \quad \forall x \in \mathbb{X}, \tag{3} \]

is the length function associated with $G$. □
Observe that

\[ c = \sum_{a \in \mathbb{X}} G(a)^{-1} = \sum_{i=1}^{|\mathbb{X}|} \frac{1}{i} \le 1 + \ln |\mathbb{X}|, \tag{4} \]

and therefore the PMF in (2) is well-defined. We record the intimate relationship between these associated quantities in the following result.

Proposition 2: Given a guessing function $G$, the associated quantities satisfy

\[ c^{-1} \cdot Q_G(x)^{-1} = G(x) \le Q_G(x)^{-1}, \tag{5} \]
\[ L_G(x) - 1 - \log c \le \log G(x) \le L_G(x). \tag{6} \]

□

Proof: The first equality in (5) follows from the definition in (2), and the second inequality from the fact that $c \ge 1$. The upper bound in (6) follows from the upper bound in (5) and from (3). The lower bound in (6) follows from

\begin{align*}
\log G(x) &= \log\left( c^{-1} \cdot Q_G(x)^{-1} \right) \\
&= -\log Q_G(x) - \log c \\
&> \left( \lceil -\log Q_G(x) \rceil - 1 \right) - \log c \\
&= L_G(x) - 1 - \log c.
\end{align*}

■
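The associated quantities of Definition 1 and the sandwich in Proposition 2 are easy to check on a toy alphabet. The following is a minimal sketch; the seven-letter alphabet and the function name `associates` are assumptions made only for illustration, and logarithms are taken to base 2.

```python
import math

def associates(order):
    """Given the guessing order (first guess first), return c, G, Q_G, L_G."""
    G = {x: i + 1 for i, x in enumerate(order)}
    c = sum(1.0 / g for g in G.values())                 # normalization constant in (4)
    Q = {x: 1.0 / (c * G[x]) for x in G}                 # associated PMF, as in (2)
    L = {x: math.ceil(-math.log2(Q[x])) for x in G}      # associated lengths, as in (3)
    return c, G, Q, L

c, G, Q, L = associates(list("abcdefg"))

# Kraft sum of the associated lengths; it should not exceed 1.
kraft_sum = sum(2.0 ** -L[x] for x in L)

# The sandwich (6): L_G(x) - 1 - log c <= log G(x) <= L_G(x).
sandwich_ok = all(
    L[x] - 1 - math.log2(c) <= math.log2(G[x]) <= L[x] for x in G
)
```

The check confirms that `L` is a valid length function and that the number of guesses and the associated length differ by at most $1 + \log c$ in the exponent.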

We now associate a guessing function $G_L$ to each length function $L$.

Definition 3: Given a length function $L$, we define the associated guessing function $G_L$ to be the one that guesses in the increasing order of $L$-lengths. Messages with the same $L$-length are ordered using an arbitrary fixed rule, say the lexicographic order on $\mathbb{X}$. We also define the associated PMF $Q_L$ on $\mathbb{X}$ to be

\[ Q_L(x) = \frac{2^{-L(x)}}{\sum_{a \in \mathbb{X}} 2^{-L(a)}}. \tag{7} \]

□

Proposition 4: For a length function $L$, the associated PMF and the guessing function satisfy the following:

1) $G_L$ guesses messages in the decreasing order of $Q_L$-probabilities;

2)

\[ \log G_L(x) \le \log Q_L(x)^{-1} \le L(x). \tag{8} \]

□

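Proof: The first statement is clear from the definition of $G_L$ and from (7).
Letting $1\{E\}$ denote the indicator function of an event $E$, we have as a consequence of statement 1) that

\begin{align*}
G_L(x) &\le \sum_{a \in \mathbb{X}} 1\{Q_L(a) \ge Q_L(x)\} \\
&\le \sum_{a \in \mathbb{X}} \frac{Q_L(a)}{Q_L(x)} \\
&= Q_L(x)^{-1}, \tag{9}
\end{align*}

which proves the left inequality in (8). This inequality was known to Wyner [7].
The last inequality in (8) follows from (7) and Kraft's inequality (1) as follows:

\[ Q_L(x)^{-1} = 2^{L(x)} \sum_{a \in \mathbb{X}} 2^{-L(a)} \le 2^{L(x)}. \]

■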
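Definition 3 and the chain (8) can be illustrated as follows. This is a minimal sketch on a hypothetical four-letter alphabet with lengths chosen to meet Kraft's inequality with equality; the helper names are illustrative.

```python
def guessing_from_lengths(L):
    """Guess in increasing order of length; ties are broken lexicographically."""
    order = sorted(L, key=lambda x: (L[x], x))
    return {x: i + 1 for i, x in enumerate(order)}

def associated_pmf(L):
    """Q_L(x) proportional to 2^{-L(x)}, as in (7)."""
    z = sum(2.0 ** -l for l in L.values())
    return {x: 2.0 ** -L[x] / z for x in L}

lengths = {"a": 1, "b": 2, "c": 3, "d": 3}   # Kraft sum equals 1
G_L = guessing_from_lengths(lengths)
Q_L = associated_pmf(lengths)

# The chain (8) in exponentiated form: G_L(x) <= 1/Q_L(x) <= 2^{L(x)}.
chain_ok = all(G_L[x] <= 1.0 / Q_L[x] <= 2.0 ** lengths[x] for x in lengths)
```

Here the message with the shortest codeword is guessed first, and no message needs more than $2^{L(x)}$ guesses.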

Let $\{L(x) \ge B\}$ denote the set $\{x \in \mathbb{X} \mid L(x) \ge B\}$. We then have the following easy to verify corollary to Propositions 2 and 4.

Corollary 5: For a given $G$, its associated length function $L_G$, and any $B \ge 1$, we have

\[ \{L_G(x) \ge B + 1 + \log c\} \subset \{G(x) \ge 2^B\} \subset \{L_G(x) \ge B\}. \tag{10} \]

Analogously, for a given $L$, its associated guessing function $G_L$, and any $B \ge 1$, we have

\[ \{G_L(x) \ge 2^B\} \subset \{L(x) \ge B\}. \tag{11} \]

□

The inequalities between the associates in (6) and (8) indicate the direct relationship between guessing moments and Campbell's coding problem [4], and that the Rényi entropies are the optimal growth exponents for guessing moments. See (14) below. They also establish a simple and new result: the minimum expected value of the logarithm of the number of guesses is close to the Shannon entropy.
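One way to see the last claim is the following short derivation, given here as a sketch. It assumes the standard facts that the Shannon lengths $L_S(x) = \lceil -\log P(x) \rceil$ satisfy $\mathbb{E}[L_S(X)] < H(P) + 1$, and that every length function obeying Kraft's inequality has $\mathbb{E}[L(X)] \ge H(P)$; the symbol $L_S$ is introduced only for this illustration. Let $G^*$ guess in the decreasing order of $P$-probabilities.

```latex
\begin{align*}
\mathbb{E}[\log G^*(X)]
  &\le \mathbb{E}[\log G_{L_S}(X)]
   \le \mathbb{E}[L_S(X)]
   <   H(P) + 1
  && \text{(optimality of } G^* \text{, then (8))} \\
\mathbb{E}[\log G^*(X)]
  &\ge \mathbb{E}[L_{G^*}(X)] - 1 - \log c
   \ge H(P) - 1 - \log c
  && \text{(left inequality in (6))}
\end{align*}
```

Together these give $H(P) - 1 - \log c \le \mathbb{E}[\log G^*(X)] < H(P) + 1$, with $c \le 1 + \ln |\mathbb{X}|$ by (4).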

We now demonstrate other relationships between guessing moments and average exponential coding lengths which will be 
useful in establishing universality properties. 

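Proposition 6: Let $L$ be any length function on $\mathbb{X}$, $G_L$ the guessing function associated with $L$, $P$ a PMF on $\mathbb{X}$, $\rho \in (0, \infty)$, $L^*$ the length function that minimizes $\mathbb{E}[2^{\rho L(X)}]$, where the expectation is with respect to $P$, $G^*$ the guessing function that proceeds in the decreasing order of $P$-probabilities and therefore the one that minimizes $\mathbb{E}[G(X)^\rho]$, and $c$ as in (4). Then

\[ \frac{\mathbb{E}[G_L(X)^\rho]}{\mathbb{E}[G^*(X)^\rho]} \le \frac{\mathbb{E}[2^{\rho L(X)}]}{\mathbb{E}[2^{\rho L^*(X)}]} \cdot 2^{\rho(1 + \log c)}. \tag{12} \]

Analogously, let $G$ be any guessing function, and $L_G$ its associated length function. Then

\[ \frac{\mathbb{E}[G(X)^\rho]}{\mathbb{E}[G^*(X)^\rho]} \ge \frac{\mathbb{E}[2^{\rho L_G(X)}]}{\mathbb{E}[2^{\rho L^*(X)}]} \cdot 2^{-\rho(1 + \log c)}. \tag{13} \]

Also,

\[ \left| \frac{1}{\rho} \log \mathbb{E}[G^*(X)^\rho] - \frac{1}{\rho} \log \mathbb{E}[2^{\rho L^*(X)}] \right| \le 1 + \log c. \tag{14} \]

□

Proof: Observe that

\begin{align*}
\mathbb{E}[2^{\rho L(X)}] &\ge \mathbb{E}[G_L(X)^\rho] \tag{15} \\
&\ge \mathbb{E}[G^*(X)^\rho] \\
&\ge \mathbb{E}[2^{\rho L_{G^*}(X)}] \, 2^{-\rho(1 + \log c)} \tag{16} \\
&\ge \mathbb{E}[2^{\rho L^*(X)}] \, 2^{-\rho(1 + \log c)}, \tag{17}
\end{align*}

where (15) follows from (8), and (16) from the left inequality in (6). The result in (12) immediately follows. A similar argument shows (13). Finally, (14) follows from the inequalities leading to (17) by setting $L = L^*$. ■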

Thus if we have a length function whose performance is close to optimal, then its associated guessing function is close to 
guessing optimal. The converse is true as well. Moreover, the optimal guessing exponent is within 1 + log c of the optimal 
coding exponent for the length function. 
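The chain of inequalities (15)-(17) can be observed numerically. The sketch below assumes an illustrative four-letter PMF and $\rho = 1$, and uses lengths built from the tilted PMF $Q(x) \propto P(x)^{1/(1+\rho)}$ as a stand-in for a good (not necessarily optimal) length function; these choices and the function names are assumptions made for the example only.

```python
import math

def moment(p, values, rho):
    """E[values(X)^rho] under the PMF p."""
    return sum(p[x] * values[x] ** rho for x in p)

def chain_terms(p, rho):
    # G*: guess in decreasing order of P-probabilities.
    ranked = sorted(p, key=lambda x: (-p[x], x))
    G_star = {x: i + 1 for i, x in enumerate(ranked)}
    c = sum(1.0 / g for g in G_star.values())
    # L_{G*}(x) = ceil(-log2 Q_{G*}(x)) with Q_{G*}(x) = 1 / (c G*(x)).
    L_Gstar = {x: math.ceil(math.log2(c * G_star[x])) for x in p}
    # Lengths from the tilted PMF Q(x) proportional to P(x)^{1/(1+rho)}.
    z = sum(px ** (1.0 / (1.0 + rho)) for px in p.values())
    L = {x: math.ceil(-math.log2(p[x] ** (1.0 / (1.0 + rho)) / z)) for x in p}
    order = sorted(p, key=lambda x: (L[x], x))
    G_L = {x: i + 1 for i, x in enumerate(order)}
    coding = moment(p, {x: 2.0 ** L[x] for x in p}, rho)            # E[2^{rho L}]
    guess_L = moment(p, G_L, rho)                                    # E[G_L^rho]
    guess_opt = moment(p, G_star, rho)                               # E[G*^rho]
    lower = (moment(p, {x: 2.0 ** L_Gstar[x] for x in p}, rho)
             * 2.0 ** (-rho * (1.0 + math.log2(c))))                 # right side of (16)
    return coding, guess_L, guess_opt, lower

P = {"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.1}
coding, guess_L, guess_opt, lower = chain_terms(P, rho=1.0)
```

The four returned values should appear in nonincreasing order, mirroring (15) and (16).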

Let us now consider strings of length $n$. Let $\mathbb{X}^n$ denote the set of messages and consider $n \to \infty$. It is now easy to see that universality in the average exponential coding rate sense implies existence of a universal guessing strategy that achieves the optimal exponent for guessing. For each source in the class, let $P_n$ be its restriction to strings of length $n$ and let $L_n^*$ denote an optimal length function that attains the minimum value of $\mathbb{E}[2^{\rho L_n(X^n)}]$ among all length functions, the expectation being with respect to $P_n$. On the other hand, let $L_n$ be a sequence of length functions for the class of sources that does not depend on the actual source within the class. Suppose further that the length sequence $L_n$ is asymptotically optimal, i.e.,

\[ \lim_{n \to \infty} \frac{1}{n\rho} \log \mathbb{E}\left[2^{\rho L_n(X^n)}\right] = \lim_{n \to \infty} \frac{1}{n\rho} \log \mathbb{E}\left[2^{\rho L_n^*(X^n)}\right] \]

for every source belonging to the class. $L_n$ is thus "universal" for (i.e., asymptotically optimal for all sources in) the class. An application of (12), with $c$ in (12) denoted by $c_n$, followed by the observation that $(1 + \log c_n)/n \to 0$ (since $c_n \le 1 + n \ln |\mathbb{X}|$ by (4)), shows that the sequence of guessing strategies $G_{L_n}$ is asymptotically optimal for the class, i.e.,

\[ \lim_{n \to \infty} \frac{1}{n\rho} \log \mathbb{E}\left[G_{L_n}(X^n)^\rho\right] = \lim_{n \to \infty} \frac{1}{n\rho} \log \mathbb{E}\left[G^*(X^n)^\rho\right]. \]

Arikan and Merhav [8] provide a universal guessing strategy for the class of discrete memoryless sources (DMS). For the class of unifilar sources with a known number of states, the minimum description length encoding is asymptotically optimal for Campbell's coding length problem (see Merhav [6]). It follows as a consequence of the above argument that guessing in the increasing order of description lengths is asymptotically optimal. (See also the development in Section IV.) The left side of (12) is the extra factor in the expected number of guesses (relative to the optimal value) due to lack of knowledge of the specific source in the class. Our prior work [9] characterizes this loss as a function of the uncertainty class.



III. Guessing with key-rate constraints and source compression 

We continue to consider strings of length $n$. Let $X^n$ be a message and $U^k$ the secure key of $k$ purely random bits independent of $X^n$. Recall that the transmitter computes the cryptogram $Y = f_n(X^n, U^k)$ and sends it to the receiver over a public channel. Given a PMF of $X^n$, the function $f_n$, and the cryptogram $Y$, the attacker's optimal strategy is to guess in the decreasing order of posterior probabilities $P_{X^n|Y}(\cdot \mid y)$. Let us denote this optimal attack strategy as $G_{f_n}$. The key rate for the system is $k/n = R < \log |\mathbb{X}|$. If the attacker does not know the source statistics, a robust guessing strategy is needed. The following is a first step in this direction.

Proposition 7: Let $L_n$ be an arbitrary length function on $\mathbb{X}^n$. There is a guessing list $G$ such that for any encryption function $f_n$, we have

\[ G(x^n \mid y) \le 2 \min\left\{ 2^{nR}, 2^{L_n(x^n)} \right\}. \]

□

Proof: We use a technique of Merhav and Arikan [2]. Let $G_{L_n}$ denote the associated guessing function that proceeds in the increasing order of the lengths $L_n$ and completely ignores the cryptogram. Let $G_{L_n}$ proceed in the order $x_1^n, x_2^n, \ldots$. By Proposition 4, we need at most $2^{L_n(x^n)}$ guesses to identify $x^n$.

Consider the alternative exhaustive key-search attack defined by the following guessing list:

\[ f_n^{-1}(y, u_1^k), \; f_n^{-1}(y, u_2^k), \; \ldots, \]

where $u_1^k, u_2^k, \ldots$ is an arbitrary ordering of the keys. This strategy identifies $x^n$ in at most $2^{nR}$ guesses.

Finally, let $G(\cdot \mid y)$ be the list that alternates between the two lists, skipping those already guessed, i.e., the one that proceeds in the order

\[ \left\{ x_1^n, \; f_n^{-1}(y, u_1^k), \; x_2^n, \; f_n^{-1}(y, u_2^k), \; \ldots \right\}. \tag{18} \]

Clearly, for every $x^n$, we need at most twice the minimum of the two original lists. ■
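The interleaving in (18) can be sketched in a few lines. The toy cipher below (3-bit messages indexed 0 to 7, one key bit XORed onto the message index) and the particular length order are assumptions chosen only to make the example concrete.

```python
from itertools import zip_longest

def interleave(list_a, list_b):
    """Alternate between two guessing lists, skipping messages already guessed."""
    seen, merged = set(), []
    for pair in zip_longest(list_a, list_b):
        for x in pair:
            if x is not None and x not in seen:
                seen.add(x)
                merged.append(x)
    return merged

def f_inv(y, u):
    """Hypothetical decryption: XOR the key bit onto the cryptogram index."""
    return y ^ u

length_order = [0, 7, 1, 2, 4, 3, 5, 6]         # some fixed order of increasing lengths
y = 6                                            # observed cryptogram
key_search = [f_inv(y, u) for u in (0, 1)]       # exhaustive key-search list
combined = interleave(length_order, key_search)
```

Each message appears in `combined` no later than twice its position in whichever of the two lists finds it first, which is the content of Proposition 7.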
We now look at a weak converse to the above in the expected sense. Our proof also suggests an asymptotically optimal encryption strategy for sources with memory.

Proposition 8: Fix $n \in \mathbb{N}$, $\rho > 0$, and let $c_n$ denote the constant in (4) as a function of $n$ with $\mathbb{X}^n$ replacing $\mathbb{X}$. There is an encryption function $f_n$ and a length function $L_n$ such that every guessing strategy $G(\cdot \mid y)$ (and in particular $G_{f_n}$) satisfies

\[ \mathbb{E}\left[G(X^n \mid Y)^\rho\right] \ge \frac{1}{(2c_n)^\rho (2 + \rho)} \, \mathbb{E}\left[ \left( \min\left\{ 2^{L_n(X^n)}, 2^{nR} \right\} \right)^\rho \right]. \]

□

Proof: The proof is an extension of Merhav and Arikan's proof of [2, Th. 1] to sources with memory. The idea is to identify an encryption mechanism that maps messages of roughly equal probability to each other.

Let $P_n$ be any PMF on $\mathbb{X}^n$. Enumerate the elements of $\mathbb{X}^n$ in the decreasing order of their probabilities. For convenience, let $M = 2^{nR}$. If $M$ does not divide $|\mathbb{X}|^n$, append a few dummy messages of zero probability to make the number of messages $N$ a multiple of $M$. Index the messages from $0$ to $N - 1$. Henceforth, we identify a message by its index.

Divide the messages into groups of $M$ so that message $m$ belongs to group $T_j$, where $j = \lfloor m/M \rfloor$, and $\lfloor \cdot \rfloor$ is the floor function. Enumerate the key streams from $0$ to $M - 1$, so that $0 \le u \le M - 1$. The function $f_n$ is now defined as follows. For $m = jM + i$ set

\[ f_n(jM + i, u) = jM + (i \oplus u), \]

where $i \oplus u$ is the bit-wise XOR operation. Thus messages in group $T_j$ are encrypted to messages in the same group. The index $i$ identifying the specific message in group $T_j$, i.e., the last $nR$ bits of $m$, are encrypted via bit-wise XOR with the key stream. Given $u$ and the cryptogram, decryption is clear: perform bit-wise XOR with $u$ on the last $nR$ bits of $y$.

Given a cryptogram $y$, the only information that the attacker gleans is that the message belongs to the group determined by $y$. Indeed, if $y \in T_j$,

\[ P_n\{Y = y\} = \frac{1}{M} P_n\{X^n \in T_j\}, \]

and therefore

\[ P_n\{X^n = m \mid Y = y\} = \begin{cases} \dfrac{P_n\{X^n = m\}}{P_n\{X^n \in T_j\}}, & m \in T_j, \\ 0, & \text{otherwise}, \end{cases} \]

decreases with $m$ for $m \in T_j$, and is $0$ for $m \notin T_j$. The attacker's best strategy $G_{f_n}(\cdot \mid y)$ is therefore to restrict his guesses to $T_j$ and guess in the order $jM, jM + 1, \ldots, jM + M - 1$. Thus, when $x^n = jM + i$, the optimal attack strategy requires $i + 1$ guesses.
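The group-wise XOR encryption described above can be sketched directly on message indices. The values $M = 4$ and $N = 12$ below are illustrative assumptions, and the function names are not from the paper.

```python
def encrypt(m, u, M):
    """f_n(jM + i, u) = jM + (i XOR u); M = 2^{nR} must be a power of two."""
    j, i = divmod(m, M)
    return j * M + (i ^ u)

def decrypt(y, u, M):
    """XOR with the same key undoes the encryption within the group."""
    return encrypt(y, u, M)

M, N = 4, 12   # 2 key bits, 3 groups of 4 messages

round_trip = all(
    decrypt(encrypt(m, u, M), u, M) == m for m in range(N) for u in range(M)
)
# The cryptogram always stays in the group of the message.
same_group = all(
    encrypt(m, u, M) // M == m // M for m in range(N) for u in range(M)
)
# As the key varies, message 5 is mapped onto every member of its group.
cryptograms_of_5 = sorted(encrypt(5, u, M) for u in range(M))
```

Because each message is spread uniformly over its group, the cryptogram reveals the group index $j$ and nothing more.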
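We now analyze the performance of this attack strategy as follows.

\begin{align*}
\mathbb{E}\left[G_{f_n}(X^n \mid Y)^\rho\right]
&= \sum_{j=0}^{N/M-1} \sum_{i=0}^{M-1} P_n\{X^n = jM + i\} (i + 1)^\rho \\
&\ge \sum_{j=0}^{N/M-1} P_n\{X^n = (j+1)M - 1\} \sum_{i=0}^{M-1} (i + 1)^\rho \tag{19} \\
&\ge \sum_{j=0}^{N/M-1} P_n\{X^n = (j+1)M - 1\} \frac{M^{1+\rho}}{1 + \rho} \tag{20} \\
&\ge \frac{1}{1 + \rho} \sum_{j=0}^{N/M-1} \sum_{i=0}^{M-1} P_n\{X^n = (j+1)M + i\} M^\rho \tag{21} \\
&= \frac{1}{1 + \rho} \sum_{m=M}^{N-1} P_n\{X^n = m\} M^\rho, \tag{22}
\end{align*}

where (19) follows because the arrangement in the decreasing order of probabilities implies that

\[ P_n\{X^n = jM + i\} \ge P_n\{X^n = (j+1)M - 1\} \]

for $i = 0, \ldots, M - 1$. Inequality (20) follows because

\[ \sum_{i=0}^{M-1} (i + 1)^\rho = \sum_{i=1}^{M} i^\rho \ge \int_0^M z^\rho \, dz = \frac{M^{1+\rho}}{1 + \rho}. \]

Inequality (21) follows because by the decreasing probability arrangement

\[ P_n\{X^n = (j+1)M - 1\} \ge \frac{1}{M} \sum_{i=0}^{M-1} P_n\{X^n = (j+1)M + i\}, \]

and (22) holds with the convention that $P_n\{X^n = m\} = 0$ for $m \ge N$. Thus (22) implies that

\begin{align*}
\sum_{m=0}^{N-1} P_n\{X^n = m\} \left( \min\{m + 1, M\} \right)^\rho
&= \sum_{m=0}^{M-1} P_n\{X^n = m\} (m + 1)^\rho + \sum_{m=M}^{N-1} P_n\{X^n = m\} M^\rho \\
&\le \mathbb{E}\left[G_{f_n}(X^n \mid Y)^\rho\right] + (1 + \rho) \, \mathbb{E}\left[G_{f_n}(X^n \mid Y)^\rho\right] \\
&= (2 + \rho) \, \mathbb{E}\left[G_{f_n}(X^n \mid Y)^\rho\right]. \tag{23}
\end{align*}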



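Set $G_P$ to be the guessing function that guesses in the decreasing order of $P$-probabilities without regard to $Y$, i.e., $G_P(m) = m + 1$. Let $L_{G_P}$ be the associated length function. Now use (23) and (6) to get

\begin{align*}
\mathbb{E}\left[G_{f_n}(X^n \mid Y)^\rho\right]
&\ge \frac{1}{2 + \rho} \, \mathbb{E}\left[ \left( \min\{G_P(X^n), M\} \right)^\rho \right] \\
&\ge \frac{1}{2 + \rho} \, \mathbb{E}\left[ \left( \min\left\{ \frac{2^{L_{G_P}(X^n)}}{2 c_n}, M \right\} \right)^\rho \right] \\
&\ge \frac{1}{(2 c_n)^\rho (2 + \rho)} \, \mathbb{E}\left[ \left( \min\left\{ 2^{L_{G_P}(X^n)}, M \right\} \right)^\rho \right].
\end{align*}

Since $G_{f_n}$ is the strategy that minimizes $\mathbb{E}[G(X^n \mid Y)^\rho]$, the proof is complete. ■
For a given $\rho > 0$ and key rate $R > 0$, define

\[ E_n(R, \rho) = \sup_{f_n} \frac{1}{n} \log \mathbb{E}\left[G_{f_n}(X^n \mid Y)^\rho\right], \]

where the supremum is over encryption functions $f_n$.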
Propositions 7 and 8 naturally suggest the following coding problem: identify

\[ E_{n,l}(R, \rho) = \min_{L_n} \frac{1}{n} \log \mathbb{E}\left[ \left( \min\left\{ 2^{L_n(X^n)}, 2^{nR} \right\} \right)^\rho \right]. \tag{24} \]

Analogous to (14), we can relate $E_n(R, \rho)$ and $E_{n,l}(R, \rho)$ for a specified key rate $R$. The following is a corollary to Propositions 7 and 8.

Corollary 9: For a given $R$, $\rho > 0$, we have

\[ \left| E_{n,l}(R, \rho) - E_n(R, \rho) \right| \le \frac{\log\left( 2^{2\rho} c_n^\rho (2 + \rho) \right)}{n}. \]

□



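Proof: Let $L_n^*$ be the length function that achieves $E_{n,l}(R, \rho)$. By Proposition 7 and after taking expectations, we have the guessing strategy $G(\cdot \mid y)$ that satisfies

\begin{align*}
\mathbb{E}\left[ \left( \min\left\{ 2^{L_n^*(X^n)}, 2^{nR} \right\} \right)^\rho \right]
&\ge \sup_{f_n} \frac{1}{2^\rho} \mathbb{E}\left[G(X^n \mid Y)^\rho\right] \\
&\ge \sup_{f_n} \frac{1}{2^\rho} \mathbb{E}\left[G_{f_n}(X^n \mid Y)^\rho\right] \\
&\ge \frac{1}{2^{2\rho} c_n^\rho (2 + \rho)} \, \mathbb{E}\left[ \left( \min\left\{ 2^{L_n(X^n)}, 2^{nR} \right\} \right)^\rho \right] \\
&\ge \frac{1}{2^{2\rho} c_n^\rho (2 + \rho)} \, \mathbb{E}\left[ \left( \min\left\{ 2^{L_n^*(X^n)}, 2^{nR} \right\} \right)^\rho \right],
\end{align*}

where the third inequality holds for a particular $f_n$ and $L_n$ guaranteed by Proposition 8, and the last because $L_n^*$ attains the minimum in (24). Take logarithms and normalize by $n$ to get the bound. ■

The magnitude of the difference between $E_n(R, \rho)$ and $E_{n,l}(R, \rho)$ vanishes as $n \to \infty$. Thus, the problem of finding the optimal guessing exponent is the same as that of finding the optimal exponent for a coding problem. When $R \ge \log |\mathbb{X}|$, the coding problem in (24) reduces to the one considered by Campbell in [4]. Proposition 7 shows that the optimal length function attaining the minimum in (24) yields an asymptotically optimal attack strategy on the cipher system. Moreover, the encryption strategy in Proposition 8 is asymptotically optimal.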

The following Proposition upper bounds the guessing effort needed to identify the correct message for sources with memory. 
A sharper result analogous to the DMS case is shown later for unifilar sources. 

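Proposition 10: For a given $R$, $\rho > 0$, we have

\[ \limsup_{n \to \infty} E_n(R, \rho) \le \min\left\{ \rho R, \; \limsup_{n \to \infty} E_n(\rho) \right\}, \tag{25} \]

where

\[ E_n(\rho) = \min_{L_n} \frac{1}{n} \log \mathbb{E}\left[ 2^{\rho L_n(X^n)} \right]. \]

□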



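Proof: By Corollary 9, it is sufficient to show that the sequence $E_{n,l}(R, \rho)$ is upper bounded by the sequence on the right side of (25). Let $L_n^*$ be the length function that minimizes $\mathbb{E}[2^{\rho L_n(X^n)}]$. Observe that $\min\{2^{\rho n R}, x\}$ is a concave function of $x$ for a fixed $\rho$ and $R$. Jensen's inequality then yields

\[ \mathbb{E}\left[ \min\left\{ 2^{\rho n R}, 2^{\rho L_n^*(X^n)} \right\} \right] \le \min\left\{ 2^{\rho n R}, \; \mathbb{E}\left[ 2^{\rho L_n^*(X^n)} \right] \right\}. \]

Take logarithms, normalize by $n$, and use the definition of $E_{n,l}(R, \rho)$ to get

\begin{align*}
E_{n,l}(R, \rho) &\le \frac{1}{n} \log \left( \min\left\{ 2^{\rho n R}, \; \mathbb{E}\left[ 2^{\rho L_n^*(X^n)} \right] \right\} \right) \\
&= \min\left\{ \rho R, \; \frac{1}{n} \log \mathbb{E}\left[ 2^{\rho L_n^*(X^n)} \right] \right\}.
\end{align*}

Now take the limsup as $n \to \infty$ to complete the proof. ■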
Our results thus far are applicable to a rather general class of sources with memory. In the next section, we specialize our results to the important class of unifilar sources. If the source is a DMS with defining PMF $P$, then the second term within the min in (25) is known to be $\rho H_{1/(1+\rho)}(P)$, where $H_{1/(1+\rho)}(P)$ is Rényi's entropy of order $1/(1+\rho)$ for the source. For unifilar sources, we soon show that the limsup can be replaced by a limit which equals $\rho$ times a generalization of the Rényi entropy for such a source.
IV. Unifilar Sources

In this section, we generalize the DMS results of Merhav and Arikan [2] to unifilar sources. We first make some definitions largely following Merhav's notation in [6].

Let $x^n = (x_1, \ldots, x_n)$ be a string taking values in $\mathbb{X}^n$. The string $x^n$ needs to be guessed. Let $s^n = (s_1, \ldots, s_n)$ be another sequence taking values in $\mathbb{S}^n$ where $|\mathbb{S}| < \infty$. Let $s_0 \in \mathbb{S}$ be a fixed initial state. A probabilistic source $P_n$ is finite-state with $|\mathbb{S}|$ states [6] if the probability of observing the sequence pair $(x^n, s^n)$ is given by

\[ P_n(x^n, s^n) = \prod_{i=1}^{n} P(x_i, s_i \mid s_{i-1}), \]

where $P(x_i, s_i \mid s_{i-1})$ is the joint probability of letter $x_i$ and state $s_i$ given the previous state $s_{i-1}$. The dependence of $P_n$ on the initial state $s_0$ is implicit. Typically, the letter sequence $x^n$ is observable and the state sequence $s^n$ is not. Let $H$ denote the entropy-rate of a finite-state source, i.e.,

\[ H = -\lim_{n \to \infty} \frac{1}{n} \sum_{x^n \in \mathbb{X}^n} P_n(x^n) \log P_n(x^n). \]

A finite-state source is unifilar [10, p. 187] if the state is given by a deterministic mapping $\phi : \mathbb{X} \times \mathbb{S} \to \mathbb{S}$ as

\[ s_i = \phi(x_i, s_{i-1}), \]

and the mapping $x \mapsto \phi(x, s)$ is one-to-one for each $s \in \mathbb{S}$.¹ Given $s_0$ and the sequence $x^n$, the state sequence is uniquely determined. Moreover, given $s_0$ and the state sequence $s^n$, $x^n$ is uniquely determined. An important example of a unifilar source is a $k$th order Markov source where $s_i = (x_i, x_{i-1}, \ldots, x_{i-k+1})$.
Fix $x^n \in \mathbb{X}^n$. For $s \in \mathbb{S}$, $x \in \mathbb{X}$, let

\[ Q_{x^n}(x, s) = \frac{1}{n} \sum_{i=1}^{n} 1\{x_i = x, \, s_{i-1} = s\}, \]

where $1\{A\}$ is the indicator function of the event $A$. $Q_{x^n}$ is thus an empirical PMF on $\mathbb{S} \times \mathbb{X}$. Let

\[ Q_{x^n}(s) = \sum_{x \in \mathbb{X}} Q_{x^n}(x, s). \]

The use of $Q_{x^n}$ for both the joint and the marginal PMFs is an abuse of notation. The context should make the meaning clear. Let

\[ q_{x^n}(x \mid s) = \begin{cases} Q_{x^n}(x, s) / Q_{x^n}(s), & Q_{x^n}(s) > 0, \\ 0, & Q_{x^n}(s) = 0, \end{cases} \]

denote the empirical letter probability given the state. (Given that $\phi$ is one-to-one, this actually defines a transition probability matrix on the state space.) Denote the empirical conditional entropy as

\[ H(Q_{x^n}) = -\sum_{s \in \mathbb{S}} \sum_{x \in \mathbb{X}} Q_{x^n}(x, s) \log q_{x^n}(x \mid s), \]

and the conditional Kullback-Leibler divergence between the empirical conditional PMF and the one-step transition matrix $P(x \mid s)$ as

\[ D(Q_{x^n} \,\|\, P) = \sum_{s \in \mathbb{S}} \sum_{x \in \mathbb{X}} Q_{x^n}(x, s) \log \frac{q_{x^n}(x \mid s)}{P(x \mid s)}. \]

Given that we are dealing with multiple random variables, $H(Q)$ and $D(Q \,\|\, P)$ usually stand for joint entropy and Kullback-Leibler divergence of a pair of joint distributions. We however alert the reader that they stand for conditional values in our notation.

Let us further define the type $T_{x^n}$ of a sequence $x^n$ as follows:

\[ T_{x^n} = \{ a^n \in \mathbb{X}^n \mid Q_{a^n} = Q_{x^n} \}. \]

For the unifilar source under consideration, it is easy to see that

\[ P_n(x^n) = 2^{-n \left( H(Q_{x^n}) + D(Q_{x^n} \,\|\, P) \right)}, \tag{26} \]

i.e., all elements of the same type have the same probability. Moreover, for a fixed type $Q_{x^n}$, if we set $P(x \mid s) = q_{x^n}(x \mid s)$ and observe that for the resulting unifilar source matched to $x^n$, we have $1 \ge P_n\{T_{x^n}\} = |T_{x^n}| P_n(x^n)$, we easily deduce from (26) that

\[ |T_{x^n}| \le 2^{n H(Q_{x^n})}. \tag{27} \]

¹The definition in [6] does not restrict $\phi$ to be one-to-one.
Consequently, for any unifilar $P_n$,

\[ P_n\{T_{x^n}\} = |T_{x^n}| P_n(x^n) \le 2^{-n D(Q_{x^n} \,\|\, P)}. \tag{28} \]
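The identity (26) is easy to confirm numerically. The sketch below assumes a first-order binary Markov source (so that $\phi(x, s) = x$) with an illustrative transition matrix; the function names and the sample string are hypothetical.

```python
import math
from collections import Counter

def empirical_joint(x, s0, phi):
    """Empirical PMF Q(letter, previous state) of the string x."""
    counts, s = Counter(), s0
    for letter in x:
        counts[(letter, s)] += 1
        s = phi(letter, s)
    return {pair: v / len(x) for pair, v in counts.items()}

def entropy_and_divergence(Q, P):
    """Conditional entropy H(Q) and conditional divergence D(Q || P), in bits."""
    Qs = Counter()
    for (letter, s), v in Q.items():
        Qs[s] += v
    H = -sum(v * math.log2(v / Qs[s]) for (letter, s), v in Q.items())
    D = sum(v * math.log2(v / Qs[s] / P[s][letter]) for (letter, s), v in Q.items())
    return H, D

def log2_probability(x, s0, phi, P):
    """log2 of P_n(x^n) computed directly from the transition probabilities."""
    total, s = 0.0, s0
    for letter in x:
        total += math.log2(P[s][letter])
        s = phi(letter, s)
    return total

phi = lambda letter, s: letter                        # state = previous letter
P = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}        # P[s][x], illustrative values
x = (0, 0, 1, 1, 0, 1, 0, 0)
H, D = entropy_and_divergence(empirical_joint(x, 0, phi), P)
lhs = log2_probability(x, 0, phi, P)
rhs = -len(x) * (H + D)
```

The two quantities `lhs` and `rhs` agree up to floating-point error, so the probability of a string depends on it only through its type.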



Using the fact that the mapping $x \mapsto \phi(x, s)$ is one-to-one for each $s$, it is possible to get the following useful lower bounds on the size and probability of a type for unifilar sources.

Lemma 11: (Merhav [6, Lemma 1], Gutman [11, Lemma 1]) For a unifilar source, there exists a sequence $\varepsilon(n) = \Theta(n^{-1} \log n)$ such that

\[ \left| \frac{1}{n} \log P_n\{T_{x^n}\} + D(Q_{x^n} \,\|\, P) \right| \le \varepsilon(n) \tag{29} \]

for every $x^n \in \mathbb{X}^n$.

Consequently, we also have ([6, eqn. (17)]):

\[ \left| \frac{1}{n} \log |T_{x^n}| - H(Q_{x^n}) \right| \le \varepsilon(n). \tag{30} \]

□



Let us now define in a fashion analogous to the DMS case

\[ E(R, \rho) = \max_{Q} \left[ \rho \, h(Q, R) - D(Q \,\|\, P) \right], \tag{31} \]

where $h(Q, R) = \min\{H(Q), R\}$, $Q$ is a joint PMF on $\mathbb{S} \times \mathbb{X}$ with letter probabilities given the state identified by $q(x \mid s)$, and $H(Q)$ is the conditional entropy

\[ H(Q) = -\sum_{s \in \mathbb{S}} \sum_{x \in \mathbb{X}} Q(x, s) \log q(x \mid s). \]

$P(x \mid s)$ is the conditional PMF that defines the unifilar source. The initial state $s_0$ is irrelevant in the definition of $E(R, \rho)$.
We now state and prove a generalization of the Merhav and Arikan result [2, Th. 1].
Theorem 12: For any unifilar source, any $\rho > 0$,

\[ \lim_{n \to \infty} E_n(R, \rho) = E(R, \rho). \]

□

Proof: We show that the limiting value of $E_{n,l}(R, \rho)$ exists for the corresponding coding problem and equals $E(R, \rho)$. Corollary 9 then implies that $E_n(R, \rho)$ for the guessing problem has the same limiting value.

Let $L_n$ be a minimal length function that attains $E_{n,l}(R, \rho)$. Arrange the elements of $\mathbb{X}^n$ in the decreasing order of their probabilities. Furthermore, ensure that all sequences belonging to the same type occur together. Enumerate the sequences from $0$ to $|\mathbb{X}|^n - 1$. Henceforth we refer to a message by its index.

We claim that we may assume $L_n$ is a nondecreasing function of the message index. Suppose this is not the case. Let $j$ be the first index where the nondecreasing property is violated, i.e., $L_n(i) \le L_n(i + 1)$ for $i = 0, \ldots, j - 1$, and $L_n(j) > L_n(j + 1)$. Identify the smallest index $j^*$ that satisfies $L_n(j^*) > L_n(j + 1)$. Modify the lengths as follows: set $L_n'(j^*) = L_n(j + 1)$, then $L_n'(i + 1) = L_n(i)$ for $i = j^*, \ldots, j$, and leave the rest unchanged. Call the new set of lengths $L_n'$. In effect, we have "bubbled" $L_n(j + 1)$ towards the smaller indices to the nearest location that does not violate the nondecreasing condition. The new set of lengths will have the same or lower $\mathbb{E}\left[ \left( \min\{2^{L_n(X^n)}, 2^{nR}\} \right)^\rho \right]$. By the optimality of the original set of lengths, the new lengths are also optimal. Furthermore, as a consequence of the modification, the location of the first index where $L_n(i) > L_n(i + 1)$ has strictly increased. Continue the process until it terminates; it will after a finite number of steps. The resulting $L_n$ is nondecreasing and optimal.
Next, observe that

\[ 2^{L_n(i)} \ge i + 1 \tag{32} \]

because the length functions are such that the sequences are uniquely decipherable. Another way to see (32) is to observe that index $i$ is the $(i + 1)$st guess when guessing in the increasing order of $L_n$ as prescribed by the indices, and therefore (8) implies (32).
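We then have the following sequence of inequalities

\begin{align*}
\sum_{a^n \in \mathbb{X}^n} P_n(a^n) \left( \min\left\{ 2^{L_n(a^n)}, 2^{nR} \right\} \right)^\rho
&\ge P_n(x^n) \sum_{a^n \in T_{x^n}} \left( \min\left\{ 2^{L_n(a^n)}, 2^{nR} \right\} \right)^\rho \tag{33} \\
&\ge P_n(x^n) \sum_{i = i_0(T_{x^n})}^{i_0(T_{x^n}) + |T_{x^n}| - 1} \left( \min\left\{ i + 1, 2^{nR} \right\} \right)^\rho \tag{34} \\
&\ge P_n(x^n) \sum_{i=1}^{|T_{x^n}|} \left( \min\left\{ i, 2^{nR} \right\} \right)^\rho \tag{35} \\
&\ge P_n(x^n) \int_0^{|T_{x^n}|} \left( \min\left\{ y, 2^{nR} \right\} \right)^\rho dy \\
&\ge P_n(x^n) \, |T_{x^n}| \, \frac{1}{1 + \rho} \left( \min\left\{ |T_{x^n}|, 2^{nR} \right\} \right)^\rho \tag{36} \\
&\ge P_n\{T_{x^n}\} \, \frac{1}{1 + \rho} \left( \min\left\{ 2^{n H(Q_{x^n}) - n\varepsilon(n)}, 2^{nR} \right\} \right)^\rho \tag{37} \\
&\ge \frac{2^{-n \varepsilon(n) (1 + \rho)}}{1 + \rho} \, 2^{n \left( \rho \min\{H(Q_{x^n}), R\} - D(Q_{x^n} \,\|\, P) \right)}, \tag{38}
\end{align*}

where (33) follows by restricting the sum to sequences in type $T_{x^n}$, and (34) follows because of (32) and by setting $i_0(T_{x^n})$ as the starting index of type $T_{x^n}$. We can do this because our ordering clustered all sequences of the same type. Inequality (35) holds because every term under the summation is lower bounded by the corresponding term on the right side. Inequality (36) follows because of the following. For simplicity, let $|T_{x^n}| = N$ and $2^{nR} = M$. When $N \le M$,

\[ \frac{1}{N} \int_0^N \left( \min\{y, M\} \right)^\rho dy = \frac{1}{N} \int_0^N y^\rho \, dy = \frac{N^\rho}{1 + \rho}, \]

and when $N > M$,

\begin{align*}
\frac{1}{N} \int_0^N \left( \min\{y, M\} \right)^\rho dy
&= \frac{1}{N} \int_0^M y^\rho \, dy + \frac{1}{N} \int_M^N M^\rho \, dy \\
&= \frac{M}{N} \cdot \frac{M^\rho}{1 + \rho} + M^\rho \left( 1 - \frac{M}{N} \right) \\
&\ge \frac{M^\rho}{1 + \rho}.
\end{align*}

Inequality (37) follows from (30), and (38) follows from (29).

The type $T_{x^n}$ in (38) is arbitrary. Moreover, $D(Q \,\|\, P)$ and $H(Q)$ are continuous functions of $Q$, and the set of rational empirical functions $\{Q_{x^n}\}$ becomes dense in the class of unifilar sources with $|\mathbb{S}|$ states and $|\mathbb{X}|$ letters, as $n \to \infty$. From (38) and the above facts, we get $\liminf_{n \to \infty} E_{n,l}(R, \rho) \ge E(R, \rho)$.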

To show the other direction, we define a universal encoding for the class of unifilar sources on state space $\mathbb{S}$ with alphabet $\mathbb{X}$. Given a sequence $x^n$, encode each one of the $|\mathbb{S}|(|\mathbb{X}| - 1)$ source parameters $\{q_{x^n}(x \mid s)\}$ estimated from $x^n$. Each parameter requires $\log(n + 1)$ bits. Then use $n H(Q_{x^n})$ bits to encode the index of $x^n$ within the type $T_{x^n}$. The resulting description length can be set to

\[ L_n^*(x^n) = n H(Q_{x^n}) + |\mathbb{S}|(|\mathbb{X}| - 1) \log(n + 1), \]

where we have ignored constants arising from integral length constraints. We call this strategy the minimum description length coding and $L_n^*$ the minimum description lengths.
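A guessing order based on these description lengths can be sketched as follows. The example assumes a first-order binary Markov model ($\phi(x, s) = x$, two states, initial state 0) and short strings of length 4; as in the text, integer rounding of the lengths is ignored, and all names are illustrative.

```python
import math
from collections import Counter
from itertools import product

def mdl_length(x, s0, phi, num_states, alphabet_size):
    """n*H(Q_{x^n}) + |S|(|X| - 1) log2(n + 1), with H the empirical conditional entropy."""
    n = len(x)
    counts, state_counts, s = Counter(), Counter(), s0
    for letter in x:
        counts[(letter, s)] += 1
        state_counts[s] += 1
        s = phi(letter, s)
    n_times_H = -sum(
        c * math.log2(c / state_counts[s]) for (_, s), c in counts.items()
    )
    return n_times_H + num_states * (alphabet_size - 1) * math.log2(n + 1)

def mdl_guessing_order(n, s0, phi, num_states, alphabet):
    """All strings of length n, sorted by increasing description length (ties lexicographic)."""
    strings = list(product(alphabet, repeat=n))
    return sorted(
        strings,
        key=lambda x: (mdl_length(x, s0, phi, num_states, len(alphabet)), x),
    )

phi = lambda letter, s: letter
order = mdl_guessing_order(4, 0, phi, num_states=2, alphabet=(0, 1))
```

Strings whose state transitions are fully predictable (zero empirical conditional entropy) are guessed first, and the order uses no knowledge of the true transition probabilities.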

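$L_n^*$ depends on $x^n$ only through its type $T_{x^n}$. Moreover, there are at most $(n + 1)^{|\mathbb{S}|(|\mathbb{X}| - 1)}$ types. Using these facts, (27), and (28), we get

\begin{align*}
\mathbb{E}\left[ \left( \min\left\{ 2^{L_n^*(X^n)}, 2^{nR} \right\} \right)^\rho \right]
&= \sum_{T_{x^n}} P_n\{T_{x^n}\} \left( \min\left\{ 2^{L_n^*(x^n)}, 2^{nR} \right\} \right)^\rho \tag{39} \\
&\le (n + 1)^{|\mathbb{S}|(|\mathbb{X}| - 1)} \max_{T_{x^n}} P_n\{T_{x^n}\} \left( \min\left\{ 2^{L_n^*(x^n)}, 2^{nR} \right\} \right)^\rho \tag{40} \\
&\le (n + 1)^{(1 + \rho)|\mathbb{S}|(|\mathbb{X}| - 1)} \max_{T_{x^n}} P_n\{T_{x^n}\} \left( \min\left\{ 2^{n H(Q_{x^n})}, 2^{nR} \right\} \right)^\rho \tag{41} \\
&\le (n + 1)^{(1 + \rho)|\mathbb{S}|(|\mathbb{X}| - 1)} \, 2^{n E(R, \rho)}. \tag{42}
\end{align*}

Take logarithms and normalize by $n$ to get

\[ \limsup_{n \to \infty} E_{n,l}(R, \rho) \le E(R, \rho). \]

This completes the proof. ■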
The minimum description length coding works without knowledge of the true source parameters. Knowledge of the transition 
function $\phi$ is sufficient. In the context of guessing, the optimal attack strategy does not depend on knowledge of the source
parameters. Interlacing the exhaustive key-search attack with the attack based on increasing description lengths is asymptotically 
optimal. Incidentally, the encryption strategy of Merhav and Arikan [2, Th. 1] uses only type information for encoding, and 
is applicable to unifilar sources. The same arguments in the proof of [2, Th. 1] go to show that their encryption strategy is 
asymptotically optimal for unifilar sources. 
Let us define the quantity

$$E(\rho) = \max_{Q}\,\left[\rho H(Q) - D(Q \,\|\, P)\right]. \tag{43}$$

Observe that $E(\rho) = E(R,\rho)$ for $R \ge \log|\mathbb{X}|$, i.e., $E(\rho)$ determines the guessing exponent under perfect encryption. The
following result identifies useful properties of these functions.

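Proposition 13: $E(\rho)$ is a convex function of $\rho$. $E(R,\rho)$ is a convex function of $\rho$ and a concave function of $R$. □
Proof: Equation (43) is a maximum of affine functions of $\rho$ and is therefore convex in $\rho$. The same is the case for
$E(R,\rho)$. To see the concavity of $E(R,\rho)$ in $R$, write (31) as done in [2, Sec. IV] as

\begin{align}
E(R,\rho) &= \max_{Q}\left\{\min_{0\le\theta\le\rho}\left[\theta H(Q) + (\rho-\theta)R\right] - D(Q \,\|\, P)\right\} \notag \\
&= \max_{Q}\,\min_{0\le\theta\le\rho}\left[\theta H(Q) + (\rho-\theta)R - D(Q \,\|\, P)\right] \notag \\
&= \min_{0\le\theta\le\rho}\,\max_{Q}\left[\theta H(Q) + (\rho-\theta)R - D(Q \,\|\, P)\right] \tag{44} \\
&= \min_{0\le\theta\le\rho}\left[E(\theta) + (\rho-\theta)R\right]. \tag{45}
\end{align}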

The maximization and minimization interchange in (44) is justified because the term within square brackets, the sum of a scaled
conditional entropy and the negative of a conditional divergence, is concave in $Q$ and affine in $\theta$. Since (45) is a
minimum of affine functions in $R$, it is concave in $R$. ■
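
The identity (45) can be checked numerically in the simplest special case. The sketch below assumes a binary memoryless source with $P = (p, 1-p)$, for which $E(\theta) = (1+\theta)\log\sum_x P(x)^{1/(1+\theta)}$; it compares a grid search over $Q$ for $\max_Q[\rho\min\{H(Q),R\} - D(Q\,\|\,P)]$ with a grid search over $\theta$ for the right side of (45). The function names and grid sizes are choices made for this illustration only.

```python
import math

def h2(q):
    """Binary entropy in bits."""
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def d2(q, p):
    """Binary divergence D(q || p) in bits."""
    return q * math.log2(q / p) + (1 - q) * math.log2((1 - q) / (1 - p))

def E_perfect(theta, p):
    """E(theta) for a binary DMS, in closed form."""
    a = 1 / (1 + theta)
    return (1 + theta) * math.log2(p ** a + (1 - p) ** a)

def E_direct(R, rho, p, grid=20000):
    """max over Q of rho*min{H(Q), R} - D(Q||P), by grid search over Q."""
    return max(rho * min(h2(k / grid), R) - d2(k / grid, p)
               for k in range(1, grid))

def E_minimax(R, rho, p, grid=2000):
    """min over theta in [0, rho] of E(theta) + (rho - theta)*R, by grid search."""
    thetas = (rho * k / grid for k in range(grid + 1))
    return min(E_perfect(t, p) + (rho - t) * R for t in thetas)
```

For example, with `p = 0.2` and `rho = 1.0`, the two functions agree to within the grid resolution at `R = 0.5`, `R = 0.8`, and `R = 1.0`, which covers the three regimes of $R$ discussed for Proposition 14.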
It is easy to see the following fact for a unifilar source:

$$\lim_{n\to\infty}\frac{1}{n}\log\left(\sum_{x^n} P_n(x^n)^{1/(1+\rho)}\right)^{1+\rho} = E(\rho). \tag{46}$$

That the left side in (46) is at least as large as the right side follows from the proof in [6, Appendix B] and the observation
that $\rho H(Q) - D(Q \,\|\, P)$ is continuous in $Q$ and that the set of rational empirical PMFs $Q_{x^n}$ is dense in the set of unifilar
sources with state space $\mathbb{S}$ and alphabet $\mathbb{X}$, as $n \to \infty$. The other direction is an easy application of the method of types. The
initial state, which is implicit in $P_n$, does not affect the value of the limit (as one naturally expects in this Markov case). In the
memoryless case, i.e., when $s_i = x_i$ and $P(x \mid s)$ is independent of $s$, this quantity converges to $E(\rho) = \rho H_{1/(1+\rho)}(P)$, where
$H_{1/(1+\rho)}(P)$ is the Rényi entropy of the DMS $P$ on $\mathbb{X}$.
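
For a DMS the normalized quantity on the left side of (46) equals $\rho H_{1/(1+\rho)}(P)$ for every $n$, not only in the limit, because the sum over $x^n$ factorizes. The following sketch verifies this by brute-force enumeration for a small alphabet; the function names and the example PMF are assumptions of the illustration.

```python
import math
from itertools import product

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha != 1), in bits."""
    return math.log2(sum(pi ** alpha for pi in p)) / (1 - alpha)

def normalized_exponent(p, rho, n):
    """(1/n) * log2 of (sum over x^n of P_n(x^n)^(1/(1+rho)))^(1+rho) for a DMS."""
    a = 1 / (1 + rho)
    total = sum(math.prod(p[s] for s in xs) ** a
                for xs in product(range(len(p)), repeat=n))
    return (1 + rho) * math.log2(total) / n

p = [0.5, 0.3, 0.2]
rho = 2.0
lhs = normalized_exponent(p, rho, n=6)          # enumerates 3**6 sequences
rhs = rho * renyi_entropy(p, 1 / (1 + rho))     # rho * H_{1/(1+rho)}(P)
```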

Analogous to the DMS case, we can characterize the behavior of $E(R,\rho)$ as a function of $R$ for a particular source $P$.

Proposition 14: For a given $\rho > 0$ and a unifilar source, let $E'(\rho)$ exist. Then

$$E(R,\rho) = \begin{cases} \rho R, & R < H, \\ (\rho-\theta_0)R + E(\theta_0), & H \le R \le E'(\rho), \\ E(\rho), & R > E'(\rho), \end{cases}$$

where $\theta_0 \in [0,\rho]$ in the second case. □
Proof: Indeed, from (45) it is clear by the continuity of the term within square brackets that for all values of $R$,
$E(R,\rho) = (\rho-\theta_0)R + E(\theta_0)$ for some $\theta_0 \in [0,\rho]$, and the second case is directly proved.

Suppose $R < H$. Then we may choose $Q = P$ in (31) to get $E(R,\rho) \ge \rho R$. However, (25) indicates that $E(R,\rho) \le \rho R$,
which leads us to conclude that $E(R,\rho) = \rho R$ when $R < H$.

Next observe that $E(R,\rho) \le E(\rho)$ is direct for all values of $R$, and in particular for $R > E'(\rho)$. To show the reverse
direction, (45) yields

\begin{align}
E(R,\rho) &= \min_{0\le\theta\le\rho}\left[E(\theta) + (\rho-\theta)R\right] \notag \\
&= E(\rho) + \min_{0\le\theta\le\rho}\,(\rho-\theta)\left(R - \frac{E(\rho)-E(\theta)}{\rho-\theta}\right). \notag
\end{align}
The proof will be complete if we can show that the term within parentheses is nonnegative for $0 \le \theta < \rho$. This holds because
of the following. By the convexity of $E(\theta)$, the largest value of $(E(\rho)-E(\theta))/(\rho-\theta)$ for the given range of $\theta$ is $E'(\rho)$ (see,
for example, Royden [12, Lemma 5.5.16]), and this is upper bounded by $R$. ■
For a DMS, Merhav and Arikan [2] show that $E'(\rho) = H(P_\rho)$, where $P_\rho$ is the PMF given by

$$P_\rho(x) = \frac{P(x)^{1/(1+\rho)}}{\sum_{a\in\mathbb{X}} P(a)^{1/(1+\rho)}}.$$

They also show that $\theta_0$ is the unique solution to $R = H(P_\theta)$.
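
The DMS quantities above are easy to compute. The sketch below builds the tilted PMF $P_\rho$, evaluates $E(\rho)$ in closed form, and finds $\theta_0$ by bisection on $R = H(P_\theta)$. It assumes $H(P) \le R \le H(P_\rho)$, so that a root exists in $[0,\rho]$, and relies on $H(P_\theta)$ being nondecreasing in $\theta$. All function names are illustrative.

```python
import math

def tilted_pmf(p, rho):
    """P_rho(x), proportional to P(x)^(1/(1+rho))."""
    w = [pi ** (1 / (1 + rho)) for pi in p]
    z = sum(w)
    return [wi / z for wi in w]

def entropy(q):
    """Shannon entropy in bits."""
    return -sum(qi * math.log2(qi) for qi in q if qi > 0)

def E(p, rho):
    """E(rho) = (1+rho) * log2 sum_x P(x)^(1/(1+rho)) for a DMS."""
    return (1 + rho) * math.log2(sum(pi ** (1 / (1 + rho)) for pi in p))

def theta0(p, R, rho, iters=60):
    """Bisection for the theta in [0, rho] with H(P_theta) = R."""
    lo, hi = 0.0, rho
    for _ in range(iters):
        mid = (lo + hi) / 2
        if entropy(tilted_pmf(p, mid)) < R:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

A central finite difference of `E` at `rho` can be compared with `entropy(tilted_pmf(p, rho))` as a sanity check of $E'(\rho) = H(P_\rho)$.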



V. Large Deviations Performance 



A. General Sources With Memory 

We now study the problem of large deviations in guessing and its relation to source compression. Our goal is to extend 
the large deviations results of Merhav and Arikan [2] to sources with memory using the tight relationship between guessing 
functions and length functions. We begin with the following general result. 

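Proposition 15: 1) When $B > R > 0$, there is an attack strategy that satisfies

$$\sup_{f_n} P_n\left\{G(X^n \mid Y) \ge 2^{nB}\right\} = 0$$

for all sufficiently large $n$.
2) When $B < R$, there is an attack strategy that satisfies

$$\sup_{f_n} P_n\left\{G(X^n \mid Y) \ge 2^{nB}\right\} \le \min_{L_n} P_n\left\{L_n(X^n) \ge nB - 1\right\}.$$

3) When $B < R$, there is an encryption function $f_n$ such that

$$P_n\left\{G_{f_n}(X^n \mid Y) \ge 2^{nB}\right\} \ge \frac{1}{3}\cdot\min_{L_n} P_n\left\{L_n(X^n) \ge nB + 1 + \log c_n\right\}.$$
□

Remarks: When $B = R$, the large deviations behavior of guessing and coding may differ. If we define

$$F_n(R,B) = \inf_{f_n}\left[-\frac{1}{n}\log P_n\left\{G_{f_n}(X^n \mid Y) \ge 2^{nB}\right\}\right] \tag{48}$$

and

$$F_{n,l}(B) = \max_{L_n}\left[-\frac{1}{n}\log P_n\left\{L_n(X^n) \ge nB\right\}\right], \tag{49}$$

then $F_n(R,B) = \infty$ for all sufficiently large $n$ if $R < B$. When $R > B$, $F_n(R,B)$ is bounded between $F_{n,l}(B - 1/n)$ and
$F_{n,l}(B + (1+\log c_n)/n)$, ignoring vanishing terms.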

Proof: Observe first that for any encryption function, the strategy (15) requires at most $2^{nR+1}$ guesses. If $B > R$,
$2^{nB} > 2^{nR+1}$ for all sufficiently large $n$, and therefore

$$\sup_{f_n} P_n\left\{G(X^n \mid Y) \ge 2^{nB}\right\} = 0.$$

When $B < R$, the same strategy with an optimal $L_n$ that minimizes $P_n\{L_n(X^n) \ge nB - 1\}$ requires $G(x^n \mid y) \le 2\min\{2^{L_n(x^n)}, 2^{nR}\}$ guesses. Hence

$$\left\{G(x^n \mid y) \ge 2^{nB}\right\} \subset \left\{L_n(x^n) \ge nB - 1\right\}$$

and therefore

$$P_n\left\{G(X^n \mid Y) \ge 2^{nB}\right\} \le P_n\left\{L_n(X^n) \ge nB - 1\right\}.$$

Since this is true for any encryption function $f_n$, the second statement follows. The attack $G(\cdot \mid y)$ given by (15) interlaces
guesses in the increasing order of the $L_n$ that attains the minimum in $\min_{L_n} P_n\{L_n(X^n) \ge nB - 1\}$ with the exhaustive
key-search strategy.
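
The factor of 2 in the bound $G(x^n \mid y) \le 2\min\{2^{L_n(x^n)}, 2^{nR}\}$ comes from alternating between two guessing lists. The toy sketch below is only an illustration of that interlacing: `by_length` stands for messages listed in increasing order of a length function, and `key_search` stands for the candidate decryptions of the cryptogram under each key. Both lists, the message labels, and the function names are hypothetical.

```python
from itertools import zip_longest

_SKIP = object()  # filler used when one list is shorter than the other

def interlace(order_a, order_b):
    """Alternate between two guessing lists, dropping repeated guesses."""
    seen, merged = set(), []
    for pair in zip_longest(order_a, order_b, fillvalue=_SKIP):
        for guess in pair:
            if guess is not _SKIP and guess not in seen:
                seen.add(guess)
                merged.append(guess)
    return merged

def guesses_needed(order, target):
    """1-based number of guesses until target is hit."""
    return order.index(target) + 1

by_length = list(range(16))      # message index = rank in length order
key_search = [11, 3, 14, 6]      # decryptions of y under the 4 possible keys
attack = interlace(by_length, key_search)
```

A message at position $i$ in the first list and position $j$ in the second is reached after at most $\min\{2i-1, 2j\} \le 2\min\{i,j\}$ guesses of the interlaced list, since removing repeats can only shorten the wait.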
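Next, let $B < R$ and consider the encryption strategy given in the proof of Proposition 8 with $N = M\lceil |\mathbb{X}|^n / M \rceil$ (with
dummy messages possibly appended) and $M = 2^{nR}$. Let $G_{P_n}$ denote guessing in the decreasing order of $P_n$-probabilities.
Once again we refer to messages by their indices. For the optimal guessing strategy $G_{f_n}$, we have

\begin{align}
P_n\left\{G_{f_n}(X^n \mid Y) \ge 2^{nB}\right\}
&= \sum_{j=0}^{N/M-1}\ \sum_{i=2^{nB}-1}^{M-1} P_n\{X^n = jM + i\} \notag \\
&\ge \sum_{j=0}^{N/M-1} P_n\{X^n = (j+1)M - 1\}\,\left(M - 2^{nB}\right) \notag \\
&\ge \sum_{j=0}^{N/M-1}\ \sum_{i=0}^{M-1} P_n\{X^n = (j+1)M + i\}\,\frac{M - 2^{nB}}{M} \notag \\
&\ge \frac{1}{2}\sum_{m=M}^{N-1} P_n\{X^n = m\}, \notag
\end{align}

where the last inequality follows because $B < R$, so that $(M - 2^{nB})/M \ge 1/2$ for all sufficiently large $n$. (When $B = R$, the lower bound is 0 and this technique does not work.)
Also, rather trivially,

$$\sum_{m=2^{nB}-1}^{M-1} P_n\{X^n = m\} \le P_n\left\{G_{f_n}(X^n \mid Y) \ge 2^{nB}\right\}.$$

Putting these together, we get

$$\sum_{m=2^{nB}-1}^{N-1} P_n\{X^n = m\} = P_n\left\{G_{P_n}(X^n) \ge 2^{nB}\right\} \le 3\,P_n\left\{G_{f_n}(X^n \mid Y) \ge 2^{nB}\right\}.$$

Since $\{L_{G_{P_n}}(x^n) \ge nB + 1 + \log c_n\} \subset \{G_{P_n}(x^n) \ge 2^{nB}\}$, we get

\begin{align}
P_n\left\{G_{f_n}(X^n \mid Y) \ge 2^{nB}\right\}
&\ge \frac{1}{3}\cdot P_n\left\{L_{G_{P_n}}(X^n) \ge nB + 1 + \log c_n\right\} \notag \\
&\ge \frac{1}{3}\cdot \min_{L_n} P_n\left\{L_n(X^n) \ge nB + 1 + \log c_n\right\}, \notag
\end{align}

and this concludes the proof. ■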



B. Unifilar Sources 

In this subsection, we specialize the result of Proposition 15 to unifilar sources.
Corollary 16: For a unifilar source,

$$F(R,B) := \lim_{n\to\infty} F_n(R,B) = \begin{cases} F(B), & B < R, \\ \infty, & B > R, \end{cases}$$

where

$$F(B) = \min_{Q:\,H(Q)\ge B} D(Q \,\|\, P)$$

is the source coding error exponent for the unifilar source. □
Proof: This follows straightforwardly from the remarks immediately following Proposition 15 if we can show that
$\lim_{n\to\infty} F_{n,l}(B) = F(B)$ and that $F(B)$ is continuous in $(0, \log|\mathbb{X}|)$. This was proved by Merhav in [6, Sec. III]. ■
We remark that the optimal attack strategy does not depend on the source parameters. Guessing in the increasing order
of description lengths, interlaced with the exhaustive key-search attack, is an asymptotically optimal attack. Furthermore, as is the
case for guessing moments, the encryption strategy of Merhav and Arikan [2, Th. 2] is easily verified to be an asymptotically
optimal encryption strategy for unifilar sources when $B < R$.
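
To get a feel for the source coding error exponent $F(B) = \min_{Q: H(Q) \ge B} D(Q\,\|\,P)$, the sketch below evaluates it by grid search in the simplest special case of a binary memoryless source $P = (p, 1-p)$. The restriction to a binary DMS, the grid size, and the function names are assumptions of this illustration.

```python
import math

def h2(q):
    """Binary entropy in bits."""
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def d2(q, p):
    """Binary divergence D(q || p) in bits."""
    return q * math.log2(q / p) + (1 - q) * math.log2((1 - q) / (1 - p))

def F(B, p, grid=20000):
    """min of D(Q||P) over binary Q with H(Q) >= B, by grid search."""
    feasible = [d2(k / grid, p) for k in range(1, grid) if h2(k / grid) >= B]
    return min(feasible) if feasible else math.inf
```

With `p = 0.2`, `F(B, p)` is zero for `B` up to $H(P) \approx 0.72$ bits, since $Q = P$ is then feasible, and it increases with `B` up to $D(\text{uniform}\,\|\,P)$ at `B = 1`.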
$E(R,\rho)$ and $F(R,B)$ for unifilar sources are related via the Fenchel-Legendre transform, i.e.,

$$E(R,\rho) = \sup_{B>0}\left[\rho B - F(R,B)\right]$$

and

$$F(R,B) = \sup_{\rho>0}\left[\rho B - E(R,\rho)\right].$$

The proof is identical to that of [2, Th. 3] where this result is proved for DMSs.
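
The first transform can be checked numerically for a binary memoryless source. The sketch below assumes $F(R,B) = F(B)$ for $B \le R$ (the value at $B = R$ is taken by continuity, which is an assumption of the example) and $F(R,B) = \infty$ for $B > R$, and compares $\sup_B[\rho B - F(R,B)]$ on a grid with $\max_Q[\rho\min\{H(Q),R\} - D(Q\,\|\,P)]$ on the same grid of $Q$. Names and grid sizes are illustrative.

```python
import math

def h2(q):
    """Binary entropy in bits."""
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def d2(q, p):
    """Binary divergence D(q || p) in bits."""
    return q * math.log2(q / p) + (1 - q) * math.log2((1 - q) / (1 - p))

def legendre_check(R, rho, p, q_grid=2000, b_grid=200):
    """Return (direct value of E(R,rho), value of sup_B [rho*B - F(B)] over B <= R)."""
    pts = [(h2(k / q_grid), d2(k / q_grid, p)) for k in range(1, q_grid)]

    def F(B):
        # min of D(Q||P) over grid points Q with H(Q) >= B
        return min((d for h, d in pts if h >= B), default=math.inf)

    direct = max(rho * min(h, R) - d for h, d in pts)
    b_values = (R * k / b_grid for k in range(b_grid + 1))
    transform = max(rho * B - F(B) for B in b_values)
    return direct, transform

direct, transform = legendre_check(R=0.8, rho=1.0, p=0.2)
```

The two returned values differ by at most about $\rho R$ divided by the number of grid points in $B$, which is the resolution of the search.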
C. Finite-State Sources 

We now consider the larger class of finite state sources. The Lempel-Ziv coding strategy [5] asymptotically achieves the 
entropy rate of a finite-state source without knowledge of the source parameters. It is therefore natural to consider its use in 
attacking a cipher system that attempts to securely transmit a message put out by a finite-state source. Our next goal is to 
show that guessing in the increasing order of Lempel-Ziv coding lengths has an interesting universality property. 
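
As an informal illustration of guessing in the increasing order of Lempel-Ziv coding lengths, the sketch below parses a sequence with the incremental (LZ78) parsing rule and sorts all sequences of a small length by a length proxy. The proxy $c\,(\lceil\log c\rceil + \lceil\log|\mathbb{X}|\rceil)$, with $c$ the number of parsed phrases, is a simplification assumed for this example and is not claimed to be the exact length function of [5]; ties are broken lexicographically.

```python
import math
from itertools import product

def lz78_phrases(x):
    """Number of phrases in the incremental (LZ78) parsing of x."""
    seen, cur, count = set(), (), 0
    for a in x:
        cur += (a,)
        if cur not in seen:      # shortest string not seen before ends a phrase
            seen.add(cur)
            count += 1
            cur = ()
    return count + (1 if cur else 0)  # count a trailing incomplete phrase

def lz_length(x, alphabet_size):
    """Simplified stand-in for the LZ length: c * (ceil(log2 c) + ceil(log2 |X|))."""
    c = lz78_phrases(x)
    return c * (math.ceil(math.log2(c)) + math.ceil(math.log2(alphabet_size)))

def lz_guessing_order(n, alphabet_size):
    """All length-n sequences (n >= 1) in increasing order of the LZ length proxy."""
    seqs = product(range(alphabet_size), repeat=n)
    return sorted(seqs, key=lambda x: (lz_length(x, alphabet_size), x))

order = lz_guessing_order(8, 2)
```

The position of a sequence in `order` plays the role of its guessing index; no source statistics enter the construction.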

Let $U_{LZ}: \mathbb{X}^n \to \mathbb{N}$ be the length function for the Lempel-Ziv code [5]. The following theorem due to Merhav [6] indicates
that the Lempel-Ziv algorithm is asymptotically optimal in achieving the minimum probability of buffer overflow.

Theorem 17 (Merhav [6]): For any length function $L_n$, every finite-state source $P_n$, every $B_n \in (nH, n\log|\mathbb{X}|)$ where $H$
is the entropy rate of the source $P_n$, and all sufficiently large $n$,

$$P_n\left\{U_{LZ}(X^n) \ge B_n + n\varepsilon(n)\right\} \le (1 + \delta(n))\cdot P_n\left\{L_n(X^n) \ge B_n\right\} \tag{50}$$

where $\varepsilon(n) = \Theta(1/\sqrt{\log n})$ is a positive sequence that depends on $|\mathbb{X}|$ and $|\mathbb{S}|$, and $\delta(n) = n^2 2^{-n\varepsilon(n)}$. □
Remark: Merhav's result [6, Th. 1] assumes that $B_n = nB$ for a constant $B \in (H, \log|\mathbb{X}|)$, but the proof is valid for any
sequence $B_n \in (nH, n\log|\mathbb{X}|)$.

Let $G_{LZ}$ be the short-hand notation for the more cumbersome $G_{U_{LZ}}$, the guessing function associated with $U_{LZ}$. Let $c_n$
be as given in (4) with $\mathbb{X}^n$ replacing $\mathbb{X}$. Furthermore, for the key-constrained cipher system, let $G_{LZ}(\cdot \mid y)$ denote the attack
of guessing in the order prescribed by $G_{LZ}$ interlaced with the exhaustive key-search attack. Observe that $G_{LZ}(\cdot \mid y)$ needs
knowledge of $f_n$.

Theorem 18: For any guessing function $G_n$, every finite-state source $P_n$, every $B \in (H, \log|\mathbb{X}|)$ where $H$ is the entropy rate
of the source $P_n$, and all sufficiently large $n$,

$$P_n\left\{n^{-1}\log G_{LZ}(X^n) \ge B + \varepsilon(n) + \gamma(n)\right\} \le (1 + \delta(n))\cdot P_n\left\{n^{-1}\log G_n(X^n) \ge B\right\} \tag{51}$$

where $\varepsilon(n)$ and $\delta(n)$ are the sequences in (50), and $\gamma(n) = (1 + \log c_n)/n = \Theta(n^{-1}\log n)$.

For the key-rate constrained cipher system, let $B < R$. Then for any encryption function, we have

$$P_n\left\{n^{-1}\log G_{LZ}(X^n \mid Y) \ge B + 1/n + \varepsilon(n) + \gamma(n)\right\} \le 3(1 + \delta(n))\cdot\sup_{f_n} P_n\left\{n^{-1}\log G_{f_n}(X^n \mid Y) \ge B\right\} \tag{52}$$

for all sufficiently large $n$. □
Remark: Thus the Lempel-Ziv coding strategy provides an asymptotically optimal universal attack strategy for the class of
finite-state sources, in the sense of attaining the limiting value of (48), if the limit exists.
Proof: Observe that

\begin{align}
(1 + \delta(n))\,P_n\left\{G_n(X^n) \ge 2^{nB}\right\}
&\ge (1 + \delta(n))\,P_n\left\{L_{G_n}(X^n) \ge nB + 1 + \log c_n\right\} \tag{53} \\
&\ge P_n\left\{U_{LZ}(X^n) \ge nB + 1 + \log c_n + n\varepsilon(n)\right\} \tag{54} \\
&\ge P_n\left\{G_{LZ}(X^n) \ge 2^{nB + n\varepsilon(n) + n\gamma(n)}\right\}, \tag{55}
\end{align}

where (53) follows from the first inclusion in (10), and (54) from (50). The last inequality (55) follows from (11). This proves
the first part.

To show the second part, we use Proposition 15.3 and Theorem 17 as follows: for all sufficiently large $n$,

\begin{align}
3(1 + \delta(n))\,\sup_{f_n} P_n\left\{G_{f_n}(X^n \mid Y) \ge 2^{nB}\right\}
&\ge (1 + \delta(n))\,P_n\left\{L_n(X^n) \ge nB + n\gamma(n)\right\} \notag \\
&\ge P_n\left\{U_{LZ}(X^n) \ge nB + n\gamma(n) + n\varepsilon(n)\right\} \notag \\
&\ge P_n\left\{G_{LZ}(X^n \mid Y) \ge 2^{nB + 1 + n\gamma(n) + n\varepsilon(n)}\right\}, \notag
\end{align}

where the last inequality holds for any arbitrary encryption function with $G_{LZ}(\cdot \mid y)$ being the interlaced attack strategy. ■
Observe that $\varepsilon(n) + \gamma(n) = \Theta(1/\sqrt{\log n})$. For unifilar sources, a result analogous to Theorem 18 can be shown with
$\varepsilon(n) + \gamma(n) = \Theta(n^{-1}\log n)$. Guessing for this class of sources proceeds in the order of increasing description lengths. This
conclusion follows from a result analogous to Theorem 17 on the asymptotic optimality of minimum description length coding (see
Merhav [6, Sec. III]).

D. Competitive Optimality 

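We now demonstrate a competitive optimality property for $G_{LZ}$. From [6, eqn. (28)] extended to finite-state sources, we
have for any competing code $L_n$

$$P_n\left\{U_{LZ}(X^n) > L_n(X^n) + n\varepsilon(n)\right\} \le P_n\left\{U_{LZ}(X^n) < L_n(X^n) + n\varepsilon(n)\right\} \tag{56}$$

where $\varepsilon(n) = \Theta((\log\log n)/(\log n))$. From the relationships between guessing functions and length functions established earlier, we get

$$U_{LZ}(x^n) \ge \log G_{LZ}(x^n)$$

and

$$\log G(x^n) \ge L_G(x^n) - 1 - \log c_n.$$

We therefore conclude that

$$\left\{\log G_{LZ}(x^n) > \log G(x^n) + n(\varepsilon(n) + \gamma(n))\right\} \subset \left\{U_{LZ}(x^n) > L_G(x^n) + n\varepsilon(n)\right\}$$

and that

$$\left\{U_{LZ}(x^n) < L_G(x^n) + n\varepsilon(n)\right\} \subset \left\{\log G_{LZ}(x^n) < \log G(x^n) + n(\varepsilon(n) + \gamma(n))\right\}.$$

From these two inclusions and (56), we easily deduce the following result.

Theorem 19: For any finite-state source and any competing guessing function $G$, we have

$$P_n\left\{\log G_{LZ}(X^n) > \log G(X^n) + n\varepsilon'(n)\right\} \le P_n\left\{\log G_{LZ}(X^n) < \log G(X^n) + n\varepsilon'(n)\right\}$$

where $\varepsilon'(n) = \varepsilon(n) + \gamma(n)$. □
For unifilar sources, the above sequence of arguments for minimum description length coding and [6, eqn. (28)] imply that
we may take $\varepsilon'(n) = \Theta(n^{-1}\log n)$.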

VI. Concluding Remarks 

In this paper, we studied two measures of cryptographic security based on guessing, for sources with memory. The first one 
was based on guessing moments and the second on large deviations performance of the number of guesses. We identified an 
asymptotically optimal encryption strategy that orders the messages in the decreasing order of their probabilities, enumerates 
them, and then encrypts as many least-significant bits as there are key bits. We also identified an optimal attack strategy based 
on a length function that attains the optimal value for a source coding problem. Both these strategies need knowledge of the 
message probabilities. 

We then specialized our results to the case of unifilar sources, gave formulas for computing the two measures of performance, 
and argued that the optimal encryption strategy as well as the optimal attack strategy depended on the source parameters only 
through the number of states and letters, i.e., the optimal encryption and attack strategies are universal for this class. 

We also showed that an attack strategy based on the Lempel-Ziv coding lengths is asymptotically optimal for the class 
of finite state sources. Finally, we provided competitive optimality results for guessing in the order of increasing description 
lengths and Lempel-Ziv lengths. 

We end this paper with a short list of related open problems. 

• Consider a modification to the encryption technique of Proposition 8 where the messages are enumerated in the increasing
order of their Lempel-Ziv lengths instead of message probabilities. Does this ordering lead to an asymptotically optimal 
encryption strategy? Such a strategy would not depend on the specific knowledge of source parameters. 

• It would be of interest to see if the results on guessing moments for unifilar sources can be extended to finite-state sources. 

• The large deviations behavior of guessing when $B = R$ is not well understood and might be worth investigating.

• As mentioned in [2], one might wish to consider a scenario where only a noisy version of the cryptogram is available to 
the attacker. The degradation in the attacker's performance could be quantified.